Journal of Public Economics 6 (1976) 55-75. © North-Holland Publishing Company 


THE DESIGN OF TAX STRUCTURE: DIRECT VERSUS 
INDIRECT TAXATION* 


A.B. ATKINSON 
University of Essex, Wivenhoe Park, Colchester, England 


J.E. STIGLITZ 
Stanford University, Stanford, CA 94305, U.S.A. 


Revised version received February 1976 


1. Introduction 


The recent literature on optimal taxation may be seen as attempting to clarify 
the structure of the arguments advanced to support changes in the tax system, 
tracing the implications of taxes and quantifying (analytically) the trade-offs 
between the various objectives of tax policy. This literature has examined the 
optimal structure for particular types of taxation taken in isolation, such as the 
optimal rates of excise tax and the optimal income tax schedule. Our purpose, 
on the other hand, is to provide a broader framework and to consider the inter- 
action between different kinds of taxation. To illustrate this, we reexamine the 
age-old question of direct versus indirect taxation and the relationship of these 
taxes to the goals of efficiency, vertical equity and horizontal equity. 

After describing in section 2 the general framework of the analysis, and arguing 
that any treatment of the choice of tax structures must be centrally concerned 
with distributional considerations, we begin in section 3 with the extension of 
the classic Ramsey formula for optimal excise taxation to include vertical equity 
objectives. This was considered by Diamond and Mirrlees (1971), but the results 


*This is a revised and condensed version of the paper given at the ISPE meeting under the 
title ‘Alternative approaches to the distribution of income.’ This in turn was based on part II 
of ‘The structure of indirect taxation,’ Cowles Foundation, 1970 [part I appeared as Atkinson 
and Stiglitz (1972)] and on the draft of chapter 15 of Lectures on Public Economics, University 
of Essex, 1971. Parts of the paper have been presented by the first author at seminars at the 
universities of Essex, Harvard and Namur, and by the second author at Chicago, National 
Bureau of Economic Research-West and Stanford, and they are grateful to participants in 
these seminars for their helpful comments. This work was supported in part by National 
Science Foundation Grant SOC74-22182 at the Institute for Mathematical Studies in the 
Social Sciences at Stanford University, and in part by the Guggenheim and Ford Foundations. 
given here are in a rather different form.1 The rest of the paper is concerned with
the case where the government can employ both income and excise taxes. In 
section 4 it is shown that the existence of an optimal linear income tax may lead 
to quite different results. Section 5 introduces the possibility of a general non- 
linear income tax, and argues that under a relatively wide class of conditions — 
separability between leisure and consumption — the optimal tax system can rely 
solely on income taxation. This brings out clearly the importance of considering 
simultaneously the whole range of tax instruments open to the government.
Finally, section 6 examines the relationship between vertical and horizontal 
equity, and the implications of differences in tastes. 


2. The basic framework for taxation 


The general problem of taxation of individuals may be posed as follows. There
are individuals with a variety of characteristics, in particular their endowments
and tastes. On the basis of certain ethical premises, it is decided that individuals
with different characteristics ought to pay different taxes. If we could observe
these characteristics costlessly and perfectly, that would be the end of the
analysis: we would
simply impose a lump sum tax on individuals, with the amount differing ac- 
cording to their characteristics. The theory of optimal taxation would then be 
concerned simply with deriving, on the basis of the specified ethical premises, 
what the functional relationship between characteristics and taxes ‘ought to be.’2


It is the difficulties associated with observing characteristics which make the
[...]. The theory may be seen as being concerned with the [...].
It is thus part of what has come to be called the ‘theory 
of screening.’ The use of these surrogate characteristics gives rise to a number 
of problems similar to those discussed in the screening literature [see, for 
example, Spence (1973) and Stiglitz (1975)]. 


(1) Many of the characteristics which may be used for screening are, at least to
some extent, under the control of the individual, and basing a tax on these
is [...].

(2) Almost all characteristics which may be used for screening are imperfect;
that is, the surrogate characteristics employed to determine tax liability are
not perfectly correlated with the characteristics with which we are really
concerned.


1This section includes the distributional results referred to in our earlier paper [Atkinson
and Stiglitz (1972)].
2Another potentially important function of the tax system, to provide signals concerning
the demand for public goods, is not discussed here.


(3) There are costs (e.g. of administration) associated with even nondistortionary
screening systems.


This general view of taxation shows that the analysis of tax systems must be
inherently concerned with individual differences. As a consequence, the treatment
of, say, optimal excise taxation in a world where individuals are assumed
to be identical is at best of limited relevance. In what follows we assume that 
people differ with respect to their abilities (earning power) and their tastes, 
although for the main part of the paper (sections 3-5) we concentrate on
differences in ability. For simplicity, we assume that this can be measured by a
single parameter, n, so that an individual of ability n can do in 1/n hours what
an individual of ability 1 can do in one hour. We assume, however, that ability is
not observable directly. What one can identify depends on the nature of the
employment relationship. The following are three of the most important
possibilities, where L is the number of labor hours worked, e is the level of effort,
and income is given by Y = neL.


(i) [...] This makes sense [...] businesses, and [...] employees.

(ii) [...] This applies where individuals may have several jobs and it may be
difficult to keep track of them. It should be noted that where effort is
unobservable, one cannot infer ability, even when one can observe the wage rate.

(iii) Both wages and hours are observable, but since effort is not, ability cannot
be inferred.


Case (i) corresponds to that where income taxation is employed (Y is the 
surrogate characteristic), case (ii) to that where there is a wage tax (w is the 
surrogate characteristic), and in case (iii) there is a choice of screening devices. 


In addition to income and wages, [...]. In a world where income and wages are
unobservable, but purchases of certain commodities are observable, the latter may
provide the best screening device. Whether such purchases remain good screening
devices when income and wages are observable is one of the questions to which we
address ourselves in this paper. Still other economic variables that may be useful
as screening devices are the sources of income: e.g. the government could
distinguish between [...]. For the purposes of this paper, however, [...]. There
are certain other distinctions, such as the sex, age, and marital status of the
worker, which are relatively costless to observe [...]. An argument can be made for
differentiation on this basis [see Boskin (1973)], but again, for present purposes,
we ignore such distinctions.

Thus, for an individual, we can describe the tax liability by

\[ T = T(x, Y, w). \]


In practice, almost all tax systems possess a high degree of separability, and 
indeed are often linear in some or all of the arguments. There are good reasons 
why this is so. Not only are there greater costs of calculating tax liabilities when 
nonseparable and nonlinear tax systems are employed, but also there are 
significantly higher costs of record-keeping and enforcement (with linear 
commodity taxes, for example, no record of the number of units purchased by 
a given person need be kept). Thus although separability and linearity have
great analytical advantages [...], there
are also strong economic grounds for making these assumptions.
Within this framework, we can consider the following taxes. 


Excise tax: $T = \sum_i t_i x_i = t \cdot x$,

where $t \cdot x$ denotes the inner product of the two vectors. In the simplest case
[...]; the tax may be nonproportional. Taxes may also be income-related, $t(x; Y)$,
or wage-related, $t(x; w)$, the latter applying, for example, to job-related
subsidies.

Income tax: $T = T(Y)$.
In certain cases the tax base may depend on the consumption of commodities
(e.g. medical care), so that $T = T(Y, x_i)$; it may be constrained to be linear
(constant marginal tax rate) or allowed to vary freely.
Wage tax: $T = \tau(w)L$.
Again the tax schedule may be constrained to be linear. (The problem of the
optimal wage structure in a socialist country may be viewed as determining the
function $\tau$.)

Thus the theory of optimal taxation must be concerned with the choice of tax
base as well as the structure of taxes imposed. A full analysis would, of course,
begin with the general function $T(x, Y, w)$ and examine its properties. The
difficulty with such a completely general approach is that it does not appear, at
least at this juncture, [...]. In this paper we attempt a less ambitious task and
focus primarily on the relationship between excise and income taxation. This
piecemeal approach has obvious limitations, but we hope that it is sufficient to
demonstrate the importance of
a unified treatment of the choice of tax base and the optimal design of tax rates. 
As a preliminary to this, we review in the next section the main results regarding 
excise taxes viewed in isolation; then, in sections 4-6, we examine the interaction 
with income tax. 


3. Excise taxes and distribution 


The optimal structure of indirect taxation, and particularly whether there 
should be differential rates of tax, is an old issue which has recently been
re-examined in a series of papers. Much of this literature has ignored differences
in endowments and has concentrated on efficiency aspects. At the same time, 
it has been recognized that the policy prescriptions would need to be modified 
when distributional considerations were introduced. This aspect of the problem 
was first discussed by Diamond and Mirrlees (1971); their treatment was, 


however, somewhat different from that given below. 
We assume that [...]. Each individual has a well-behaved utility function defined
over the n commodities and labor,3

\[ U^h = U^h(x^h, L^h). \tag{1} \]

The individual maximizes utility subject to the budget constraint

\[ q \cdot x^h = w^h L^h, \tag{2} \]

where $q$ is the vector of prices of the commodities to the consumer, and $w^h$ is
his after-tax wage. The solution leads to individual demand and labor supply
functions. Substituting these back into the utility function gives the indirect
utility function $V^h(q, w^h)$. There is no loss of generality (with the assumptions
made below) in letting labor be the numeraire and in assuming it to be untaxed (a
tax on labor income is simply equivalent to a uniform commodity tax). This will be
done throughout the analysis. Finally, we denote by $X_i$ the total demand for good
$i$ summed over all individuals ($\sum_h x_i^h$).


At this stage it is assumed that the only taxes open to the government are
proportional excise taxes at the rate $t_i$ on commodity $i$, and that no lump-sum
taxes are allowed.4 For simplicity, we take producer prices as fixed. [...] We
assume that the government must raise a given revenue,


3The labor variable may be treated more generally as a vector, including elements such as 
hours, effort, etc. 

4Such a restriction makes sense in the context of the general approach taken in this paper 
only if ‘individuals’ are not directly observable as individuals: e.g., with a lump-sum subsidy, 
they could collect twice under different ‘names’ or with a lump-sum tax they disappear into 
the bush. 
\[ R \equiv \sum_h t \cdot x^h \geq \bar{R}, \tag{3} \]

and that it maximizes a social welfare function of the Bergson form
$G(U^1, \ldots, U^H)$, where $G$ is increasing in all its arguments.


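Forming the Lagrangian

\[ \mathcal{L} = G(V^1, \ldots, V^H) + \lambda \Big[ \sum_h t \cdot x^h - \bar{R} \Big], \tag{4} \]

straightforward manipulation yields the result that the first-order conditions
imply5

\[ \frac{\sum_h \sum_k t_k S_{ik}^h}{X_i} = -\Big[ 1 - \sum_h b^h \Big( \frac{x_i^h}{X_i} \Big) \Big], \qquad i = 1, \ldots, n, \tag{5} \]

where

\[ S_{ik}^h = \Big( \frac{\partial x_i^h}{\partial q_k} \Big)_{\bar{U}}, \]

the compensated price derivative;

\[ b^h = \frac{\beta^h}{\lambda} + \sum_k t_k \frac{\partial x_k^h}{\partial I}, \]

the net social marginal utility of income for household $h$, using government
income as numeraire;

\[ \beta^h = \frac{\partial G}{\partial V^h} \frac{\partial V^h}{\partial I}, \]

the gross social marginal utility of income (consumption) accruing to household
$h$; and

\[ \frac{\partial R}{\partial I} = \sum_k t_k \frac{\partial x_k^h}{\partial I}, \]

the marginal tax paid by household $h$ on receiving an extra dollar of income.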


5This follows by making use of the fact that $\partial V^h/\partial q_i = -x_i^h \alpha^h$, where $\alpha^h$ is the private marginal
utility of income of individual $h$, and of the Slutsky equation

\[ \frac{\partial x_i^h}{\partial q_k} = S_{ik}^h - x_k^h \frac{\partial x_i^h}{\partial I}, \]

where $\partial x_i^h/\partial I$ is the derivative with respect to income (evaluated at $I = 0$ in this case) and $S_{ik}^h$
is the compensated price term.
In interpreting $b^h$, note that there are two effects on transferring a dollar to the
$h$th household: the direct effect, which is just $\beta^h/\lambda$ measured in government
revenue, plus an indirect effect, namely the effect of the transfer on government income.


It may also be noted that the mean ($\bar{b}$) is the net value of giving an equal
lump-sum payment to everyone. Thus, if uniform lump-sum payments or taxes were
allowed, the government would set them at a level such that $\bar{b} = 1$. The
implications of this are explored in the next section.


The left-hand side of (5) has the usual interpretation of the proportional
reduction in consumption along the compensated demand schedules. We can immediately
see that this is no longer necessarily the same for all commodities. Sufficient
conditions for it to be independent of $i$ are either that $b^h$ be the same for all
$h$ or that $x_i^h/X_i$ be the same for all commodities (there are no goods which are
consumed disproportionately by rich or poor). In general, where these are not
satisfied, the compensated reduction in demand with the optimal tax structure is
smaller:6 (1) the more the good is consumed by individuals with a high net social
marginal utility of income, (2) the more the good is consumed by households with a
high marginal propensity to consume taxed goods.

Eq. (5) can be rewritten in two ways which will prove useful in the subsequent
discussion:

\[ \sum_h \sum_k t_k S_{ik}^h = -X_i (1 - \bar{b} r_i), \qquad i = 1, \ldots, n, \tag{5'} \]

where

\[ r_i = \sum_h \Big( \frac{x_i^h}{X_i} \Big) \Big( \frac{b^h}{\bar{b}} \Big), \tag{6} \]

and

\[ \sum_h \sum_k t_k S_{ik}^h = -X_i [(1 - \bar{b}) - \bar{b} \phi_i], \qquad i = 1, \ldots, n, \tag{5''} \]

where $\phi_i = r_i - 1$ is the normalized covariance between the consumption of the
$i$th commodity and the net social marginal utility of income [a result derived
independently by Diamond (1975)]. In the first of these formulae, $r_i$ is a
generalization of the ‘distributional characteristic’ of Feldstein (1972a) and (1972b).

The second shows that if $\bar{b}$ is large, i.e. if there would be large gains from a
uniform lump-sum payment, then distributional considerations are to be weighted more
heavily.
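
The relation between the two rewritten forms can be checked numerically. The following Python sketch is illustrative only: the three households, their weights $b^h$ and the consumption vectors are invented, and $\bar{b}$ is assumed to be the simple mean of the $b^h$, in line with the text's description of it as the mean. It computes $r_i$ and $\phi_i$ for one good and evaluates the right-hand sides of (5), (5') and (5''), each divided by $X_i$.

```python
def distributional_terms(x, b):
    """Return (r_i, phi_i, b_bar) for one good.

    x[h] is household h's consumption of the good and b[h] its net social
    marginal utility of income. b_bar is assumed to be the simple mean of b.
    """
    total = sum(x)  # X_i, total demand for the good
    b_bar = sum(b) / len(b)
    r = sum((xh / total) * (bh / b_bar) for xh, bh in zip(x, b))
    return r, r - 1.0, b_bar


def rhs_forms(x, b):
    """Right-hand sides of (5), (5') and (5''), each divided by X_i."""
    total = sum(x)
    r, phi, b_bar = distributional_terms(x, b)
    rhs5 = -(1.0 - sum(bh * xh / total for xh, bh in zip(x, b)))
    rhs5p = -(1.0 - b_bar * r)
    rhs5pp = -((1.0 - b_bar) - b_bar * phi)
    return rhs5, rhs5p, rhs5pp


# Hypothetical data: households ordered from poor to rich.
b = [1.4, 1.0, 0.6]
goods = {
    "necessity": [4.0, 3.0, 2.0],
    "evenly consumed": [2.0, 2.0, 2.0],
    "luxury": [1.0, 2.0, 5.0],
}
for name, x in goods.items():
    r, phi, _ = distributional_terms(x, b)
    print(f"{name:16s} r = {r:.3f}  phi = {phi:+.3f}")
```

With these invented numbers the good consumed mainly by the rich has $\phi_i < 0$, the good consumed mainly by the poor has $\phi_i > 0$, and the evenly consumed good has $\phi_i = 0$; the three right-hand sides coincide for every good.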


6Diamond and Mirrlees (1971) derived the analogous expression for the uncompensated
changes. Since the uncompensated reductions in demand with the optimal tax structure are 
not the same even without distributional considerations, to make comparisons with the Ramsey 
results more direct, we have employed compensated derivatives. In the uncompensated form, 
Diamond and Mirrlees have identified a third factor determining the percentage reduction in 
demand: it will be greater the more the demand for the commodity is concentrated among 
individuals for whom the product of the income derivative of demand for that good and total 
taxes paid is large. 


The extension of the Ramsey formula given above is relatively general. In
particular, it allows for differences in tastes as well as in endowments; and not
all commodities need be taxed. (As in the earlier Ramsey analysis, the result does,
however, depend on there being either constant returns to scale in production
or 100 per cent profits taxes; see Stiglitz and Dasgupta (1971).) However, to
obtain detailed results on the optimal tax structure, we need to make more
specific assumptions about the nature of differences between individuals and
[...]. Here, and until section 6, we assume that [...]. For ease of analysis, we
assume a continuum of individuals, and replace the summation signs in the
previously derived formulae by integrals. We let $F$ represent the distribution
function of abilities, where we normalize such that $F(\infty) = 1$. The special case
of the utility function we consider for purposes of illustration is that where all
individuals have independent compensated demand schedules. Eqs. (5') and
(5'') then give

\[ \frac{t_i}{1 + t_i} = \frac{1 - \bar{b} r_i}{\bar{\epsilon}_i} = \frac{(1 - \bar{b}) - \bar{b} \phi_i}{\bar{\epsilon}_i}, \tag{7} \]

where $\bar{\epsilon}_i$ is the weighted average compensated price elasticity, the weights being
the consumption of the different individuals.7


[...] (7) provides a simple adjustment to this formula for distributional
considerations. The value of $r_i$ depends now solely on the social marginal valuation
of income which goes to them. In particular, [...]. If $\beta$ is constant, i.e. society
is indifferent with regard to the distribution, then the optimal tax formula is the
familiar one. But if the social marginal [...] which are primarily consumed by those
at the top of the scale.8

7The first-order conditions need careful interpretation since they may not lead to a unique 
solution. Where the price elasticity varies with q, there may be multiple solutions, and the 
optimal tax structure may involve taxing at different rates two goods with identical demand 
curves. 


8That is, letting $r_i$ be a function of $\rho$, some measure of inequality aversion with $\rho = 0$
corresponding to no inequality aversion, then $r_i(0) = 1$, for all $i$, and

\[ r_i(\rho) - r_i(0) = \frac{E[(x_i - \bar{x}_i)(b - \bar{b})]}{\bar{x}_i \bar{b}} \gtrless 0 \quad \text{as [...]} \gtrless 0, \]

i.e. households which consume more of $x_i$ (relative to mean consumption $\bar{x}_i$) have a higher or
lower valued net marginal social utility of income. (For the meaning of inequality aversion,
see Atkinson (1970) and Diamond-Stiglitz (1974).) Because of our normalization, $\bar{x}_i = X_i$.
A formula similar to (7) was given by Feldstein (1972a,b), but he did not
bring out the inherent conflict between equity and efficiency considerations.
With an additively separable utility function and constant marginal utility of
leisure, demands depend on the ratio of commodity price to wage. This means
that [...] standpoint to be a good candidate for taxation, but that [...].

One especially simple case to examine is that where the government maximizes
the sum of utilities (the classical utilitarian case) and where the compensated
demand curves have constant elasticity. In table 1, we present the value
of $\phi_i$ and the associated form of eq. (7) for the Pareto and lognormal
distributions. For the Pareto distribution, it follows that where the government
would like to make a uniform lump-sum transfer to everyone ($\bar{b} > 1$), the tax


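Table 1
Values of distributional characteristics: Pareto and lognormal distributions.

(a) Pareto distribution: $f = \delta \underline{w}^{\delta} w^{-(1+\delta)}$ (where it is required that $\delta > \epsilon_i$):

    $\phi_i = -\dfrac{\epsilon_i}{\delta(1 + \delta - \epsilon_i)}$,    $\dfrac{t_i}{1 + t_i} = \dfrac{1 - \bar{b}}{\epsilon_i} + \dfrac{\bar{b}}{\delta(1 + \delta - \epsilon_i)}$.

(b) Lognormal distribution (where $(e^{\sigma^2} - 1)^{1/2}$ is the coefficient of variation):

    $\phi_i = e^{-\epsilon_i \sigma^2} - 1$,    $\dfrac{t_i}{1 + t_i} = \dfrac{1 - \bar{b}}{\epsilon_i} + \dfrac{\bar{b}\,(1 - e^{-\epsilon_i \sigma^2})}{\epsilon_i}$.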

rate rises with the elasticity of demand; this is therefore a sufficient condition
for equity to outweigh efficiency considerations and for goods with a high price
elasticity to be taxed more heavily. It may also be noted that the magnitude of
the distributional term falls with $\delta$, or as the distribution of abilities becomes less
unequal [for the same mean, see Chipman (1974)]. For the lognormal distribution,
if $\bar{b} > 1$ and $\sigma$ is small, then again the distributional considerations
dominate; but if $\sigma$ is not small, then as the elasticity of demand increases, the
tax rate may at first increase (for low elasticities, distributional considerations
are more important) and then decrease (for high elasticities, efficiency
dominates).9
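
The distributional characteristics for the two distributions can be checked against raw moments. The Python sketch below is a hedged illustration rather than the authors' derivation: it assumes that, in the utilitarian constant-elasticity case with constant marginal utility of leisure, consumption of good $i$ is proportional to $w^{\epsilon_i}$ and the net social marginal utility of income is proportional to $1/w$, so that $\phi_i = E[w^{\epsilon_i - 1}] / (E[w^{\epsilon_i}] E[w^{-1}]) - 1$. Under that assumption it compares the moment-based value with the closed forms $-\epsilon_i/[\delta(1+\delta-\epsilon_i)]$ (Pareto) and $e^{-\epsilon_i \sigma^2} - 1$ (lognormal). All function names are hypothetical.

```python
import math


def phi_from_moments(moment, eps):
    """phi = E[x b] / (E[x] E[b]) - 1, assuming x ~ w**eps and b ~ 1/w."""
    return moment(eps - 1.0) / (moment(eps) * moment(-1.0)) - 1.0


def pareto_moment(delta, w_min=1.0):
    """Raw moments E[w**k] of the density delta * w_min**delta * w**-(1 + delta)."""
    def moment(k):
        if k >= delta:
            raise ValueError("moment exists only for k < delta")
        return delta * w_min ** k / (delta - k)
    return moment


def lognormal_moment(sigma2, mu=0.0):
    """Raw moments E[w**k] when log w is normal with mean mu, variance sigma2."""
    return lambda k: math.exp(k * mu + 0.5 * k * k * sigma2)


def phi_pareto(eps, delta):
    """Closed form for the Pareto case."""
    return -eps / (delta * (1.0 + delta - eps))


def phi_lognormal(eps, sigma2):
    """Closed form for the lognormal case."""
    return math.exp(-eps * sigma2) - 1.0


def tax_rate(phi, eps, b_bar):
    """t/(1+t) from eq. (7) with independent compensated demands."""
    return ((1.0 - b_bar) - b_bar * phi) / eps


for eps in (0.5, 1.0, 2.0):
    print(eps,
          phi_from_moments(pareto_moment(3.0), eps), phi_pareto(eps, 3.0),
          phi_from_moments(lognormal_moment(0.16), eps), phi_lognormal(eps, 0.16))
```

The moment-based and closed-form values agree for each elasticity, and neither depends on the scale parameter of the distribution (`w_min` or `mu`).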


4. Excise taxes with an optimal linear income tax 


Thus far we have considered indirect taxation in isolation from the rest of 
the tax system, and in particular we have not examined how the possibility of 


9This may be seen by expanding the term $\exp(-\epsilon_i \sigma^2)$ and first considering terms of order
$\sigma^2$, and then of order $\sigma^4$ (it is assumed that $\sigma^2 \epsilon_i < 1$).
employing direct taxes affects the optimal structure of indirect taxation. How [...]


A first step towards considering the interaction between direct and indirect 
taxation may be taken by a relatively straightforward modification of the 


analysis of the previous section. The simplest progressive income tax is that
where there is an exemption level and a proportional rate of tax both above and
below this level (the tax below the exemption level being a negative income tax
[...]). Such a linear tax schedule can readily be incorporated into the model we
have been discussing, since wages are the only source of income and a uniform tax
on all commodities is equivalent to a proportional tax on wages. The only
difference therefore is in the exemption level, which can be introduced by
supposing that the government provides a lump-sum payment identical in amount ($E$)
to all individuals (if $E$ is negative, it is a lump-sum tax). We assume an
additive, symmetric, social welfare function and write the Lagrangian

\[ \mathcal{L} = \int_0^\infty \big[ G\{V(q, E)\} + \lambda \{ t \cdot x - E - \bar{R} \} \big] \, dF. \tag{8} \]


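The indirect utility function now depends on $E$, where $\partial V/\partial E = \alpha$, the
marginal utility of income. The first-order conditions give:

\[ \frac{\partial \mathcal{L}}{\partial t_i} = \int_0^\infty \Big[ (\lambda - G' \alpha) x_i + \lambda \sum_k t_k \frac{\partial x_k}{\partial t_i} \Big] dF = 0, \qquad i = 1, \ldots, n, \tag{9} \]

\[ -\frac{\partial \mathcal{L}}{\partial E} = \int_0^\infty \Big[ (\lambda - G' \alpha) - \lambda \sum_k t_k \frac{\partial x_k}{\partial E} \Big] dF = 0. \tag{10} \]

Since $\beta = G' \alpha$, (10) is equivalent to $\bar{b} = 1$, as the previous section indicated.
Thus, with an optimal linear income tax, the percentage reduction of consumption
along the compensated demand schedule is simply equal to the normalized
covariance between consumption of the commodity and the net marginal social
utility of income (eq. (5'') with $\bar{b} = 1$).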

If $\beta$ were constant, that is, if society were indifferent regarding the distribution,
then $\{t_i = 0, \text{ all } i\}$ would provide a solution to the first-order conditions, and
if there were a positive revenue requirement, it would all be raised by a poll tax 
(E < 0). This is a quite intuitive result, since we should expect that efficiency 
considerations taken on their own would dictate using solely a lump-sum tax. 

Where the government is concerned with the distribution of income, i.e. $\beta$ is a
decreasing function of $w$, then indirect taxes would in general be employed. The
question, however, is whether they would be employed with differential rates,
[...] a proportional income tax.




The point at issue may be illustrated by one very special example. Suppose 
that the utility function is quadratic (an example used by Ramsey), that the 
cross-terms are zero, and that the marginal utility of leisure is constant: 


U= F(a) ob. (11) 


In the absence of the income tax, it may be shown that the optimal tax rates vary
according to $a_i(1 - \bar{b})$, and would in general differ across commodities. However,
the introduction of an income tax with the exemption level $E$ means that $\bar{b} = 1$,
and that the optimal tax structure is uniform. It follows that no indirect taxation 
need be employed, and that the optimum may be achieved simply through a 
linear income tax. (Another example is the linear expenditure system.) 

Where the utility function is more general, but the compensated demands are
still independent, we can see from eq. (7) that $t_i/(1 + t_i) = -\phi_i/\bar{\epsilon}_i$. We may note
two features of this result. Firstly, it implies [...] subsidizing normal goods; an
increase in the lump-sum subsidy is always superior. Secondly, the tax rates depend
on the level of revenue to be raised only through the dependence [...]. With a
constant marginal utility of leisure and $G' = 1$, $\phi_i$ is independent of the level
of revenue to be raised: any increase in $R$ is met by a reduction in $E$. Hence
for sufficiently large $R$, the tax system is regressive.

From table 1 we can derive the optimal tax rates in the constant elasticity
case. For the Pareto distribution, the tax is higher on goods with a higher price
elasticity (which is also the elasticity with respect to $w$). With $\delta = 3.0$, the tax
rates vary from 9.5 percent with $\epsilon = 0.5$ to 16.7 percent with $\epsilon = 2.0$. In the
case of the lognormal, it is quite possible for the tax rate to fall with $\epsilon$: for
example, if ($\sigma^2 \epsilon$) is sufficiently less than 1 for third and higher powers to be
neglected, then the tax rate may be approximated by $\sigma^2 - \epsilon \sigma^4/2$, which gives the
following results (where all individuals work).

                      ε
    σ²        0.5     1.0     2.0

    0.16      15%     15%     13%
    0.24      23%     21%     18%
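
The quoted figures can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the rates reported appear to be values of $t_i/(1+t_i)$ with $\bar{b} = 1$, the Pareto closed form $1/[\delta(1+\delta-\epsilon)]$ is taken from the reading of table 1 used here, and the function names are hypothetical.

```python
import math


def pareto_rate(eps, delta):
    """t/(1+t) for the Pareto case with b_bar = 1 (assumed closed form)."""
    return 1.0 / (delta * (1.0 + delta - eps))


def lognormal_rate(eps, sigma2):
    """t/(1+t) for the lognormal case with b_bar = 1 (assumed closed form)."""
    return (1.0 - math.exp(-eps * sigma2)) / eps


def lognormal_rate_approx(eps, sigma2):
    """Second-order approximation sigma^2 - eps * sigma^4 / 2 quoted in the text."""
    return sigma2 - eps * sigma2 ** 2 / 2.0


# Pareto rates, in percent, for eps = 0.5 and eps = 2.0 with delta = 3.0.
print([round(100 * pareto_rate(e, 3.0), 1) for e in (0.5, 2.0)])

# Lognormal rates, in percent, from the approximation.
for sigma2 in (0.16, 0.24):
    print(sigma2, [round(100 * lognormal_rate_approx(e, sigma2)) for e in (0.5, 1.0, 2.0)])
```

The Pareto line gives 9.5 and 16.7 percent, and the two lognormal rows give 15, 15, 13 and 23, 21, 18 percent, matching the figures in the text. `lognormal_rate` is included so that the approximation can be compared with the unexpanded expression.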


The fact that the tax structure may be regressive (i.e. the rates fall with $\epsilon$) may
appear to conflict with the intuitive notion discussed above that efficiency 
considerations would point to the use of a poll tax and that it is concern for the 
distribution which leads to the use of commodity taxes. However, when
distributional objectives are relevant, indirect taxes play two roles. Firstly, by
taxing [...]; secondly, they provide an alternative source of revenue, allowing the
regressive [...]. In the latter case, the revenue would be raised in the
distortion-minimizing way, and the final tax structure would balance the two sets
of considerations.

Going back to the general formulation (5''), we can see that

\[ \int_0^\infty \Big( \sum_k t_k S_{ik} \Big) dF = \int_0^\infty (x_i - \bar{x}_i)(b - \bar{b}) \, dF, \tag{12} \]

so that the reduction of consumption along the compensated demand curve is
simply equal to the covariance between the consumption of that good and the
net marginal social utility of income. For small variance, the tax structure may
be approximated by taking a Taylor series expansion of the RHS of (12),

\[ \sum_k t_k S_{ik} \approx \frac{\partial x_i}{\partial w} \frac{\partial b}{\partial w} \sigma_w^2, \]

where $\sigma_w^2$ is the variance of wages (abilities). Thus, the percentage reduction
(along the compensated demand curve) in consumption is exactly proportional
to the uncompensated derivative of the commodity with respect to the wage. If
there is constant marginal utility of leisure and separable demand functions, we
obtain

\[ \frac{\partial x_i}{\partial w} = -\frac{q_i}{w} \frac{\partial x_i}{\partial q_i}, \]

so

\[ \frac{t_i}{1 + t_i} \approx -\frac{1}{w} \frac{\partial b}{\partial w} \sigma_w^2, \]

independent of $i$: i.e. to the first order of approximation, there should be uniform
taxation.

Expanding $\phi_i$ further shows that to the second order of approximation,
differences in tax rates depend on the concavity or convexity of the demand
functions ($\partial^2 x_i/\partial q_i^2$) and the third moment of the ability distribution,
parameters for which we are unlikely to obtain robust estimates.

The examples given above show that the results described in the previous
section may [...].
In the next section we examine the relationship between direct and indirect 
taxation where the income tax schedule may be freely varied. 
5. Excise taxes and optimal income taxation


We assume that the income tax schedule is differentiable,10 but apart from
that may be of any form. We also allow for the possibility that the tax rate on
commodities may be a function of the level of consumption.11 The individual
with wage $w$ faces a budget constraint

\[ \sum_i [x_i + t_i(x_i)] = wL - T(wL), \tag{13} \]

and the first-order conditions for utility maximization are12

\[ \frac{U_i}{1 + t_i'} = \frac{-U_L}{w(1 - T')}, \qquad i = 1, \ldots, n. \tag{14} \]


The government maximizes the social welfare function subject to

\[ \int_0^\infty \Big[ \sum_i t_i(x_i) + T(wL) \Big] dF = \bar{R}, \]

or

\[ \int_0^\infty \Big[ wL - \sum_i x_i - \bar{R} \Big] dF = 0. \tag{15} \]
0 i 


This problem may be treated in a number of different ways. In the heuristic
argument which follows, we take $x_2, \ldots, x_n$ and $L$ as the control variables,
treating $U$ as a state variable, and making use of the fact that $x_1$ depends on
$U, x_2, \ldots, x_n$ and $L$. Moreover,

\[ \frac{dU}{dw} = -\frac{U_L L}{w} \equiv -\theta U_L. \tag{16} \]

The Hamiltonian may then be written

\[ H = \Big[ G(U) + \lambda \Big( wL - \sum_i x_i - \bar{R} \Big) \Big] f - \mu \theta U_L, \tag{17} \]


10See Mirrlees (1971). In general this need not be the case. For an analysis of such non- 
differentiabilities within the context of this class of ‘screening’ problems, see Stiglitz (1974a).

11Actually we could have considered a general tax function of the form $T(x, L, w)$. In fact,
for this particular problem, the results for the more restrictive, but practically more important, 
tax structure involving separability assumed here are identical to those in which the separability 
is dropped. This may be seen most easily by observing that nowhere in the analysis is the 
separability restriction on the tax function actually used. 

12For an interior solution; we do not consider the case where labor supply is zero, although 
the analysis could easily be modified. 
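where $f$ is the density function. Maximizing $H$ with respect to $x_i$, we obtain as
necessary conditions

\[ \frac{\partial H}{\partial x_i} = -\lambda f \Big[ 1 + \Big( \frac{\partial x_1}{\partial x_i} \Big)_U \Big] - \mu \theta \Big[ U_{Li} + U_{L1} \Big( \frac{\partial x_1}{\partial x_i} \Big)_U \Big] = 0. \tag{18} \]

From (14) it is immediate that

\[ \Big( \frac{\partial x_1}{\partial x_i} \Big)_U = -\frac{U_i}{U_1} = -\frac{1 + t_i'}{1 + t_1'}. \tag{19} \]

Thus we can rewrite (18) as

\[ \frac{t_i' - t_1'}{1 + t_i'} = \frac{\mu \theta U_1}{\lambda f} \frac{\partial \log (U_i/U_1)}{\partial L}. \tag{20} \]

Without loss of generality, we set $t_1' = 0$. Hence

\[ \frac{t_i'}{1 + t_i'} = \frac{\mu \theta U_1}{\lambda f} \frac{\partial \log (U_i/U_1)}{\partial L}. \tag{21} \]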

From this analysis we obtain at once an interesting result. If the utility
function is separable between labor and all consumption goods, then the optimal
tax system can rely solely on income taxation. It is immediate that we could have
allowed $U$ to depend on $n$ as well, as long as we maintain our separability
hypothesis: $U = U(V(x_1, \ldots, x_n), L, n)$. With the greater
flexibility provided by the nonlinear income tax schedule, the result found for
special cases in the previous section now holds for much more general utility
functions. The assumption of separability between consumption and labor may
well be regarded as a reasonable first approximation for our purpose; and even
if it is in fact empirically rejected, it is a useful benchmark case.13 From the
results given above, it follows that goods which are complementary (in the
Edgeworth, not the more usual Hicksian, sense) with leisure ($U_{iL} < 0$) will face
lower tax rates, whereas substitutes face higher tax rates. Finally, it is interesting
to note that relative tax rates are independent of the social welfare function,
so that they may be viewed as conditions for constrained Pareto optimality.14
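
The role of separability can be illustrated numerically. Eq. (21) makes the marginal commodity tax proportional to the derivative of $\log(U_i/U_1)$ with respect to $L$, so the question is whether that derivative vanishes. The Python sketch below estimates it by finite differences for two utility functions; both functional forms, the evaluation point and the function names are hypothetical choices made for illustration, not taken from the paper.

```python
import math


def dlog_mrs_dL(U, x, L, i, h=1e-4):
    """Finite-difference estimate of d log(U_i / U_1) / dL at (x, L).

    U(x, L) is a utility function; x is a list of quantities, with x[0]
    playing the role of good 1 and x[i] the role of good i.
    """
    def marginal_utility(j, labor):
        up, down = list(x), list(x)
        up[j] += h
        down[j] -= h
        return (U(up, labor) - U(down, labor)) / (2.0 * h)

    def log_ratio(labor):
        return math.log(marginal_utility(i, labor) / marginal_utility(0, labor))

    return (log_ratio(L + h) - log_ratio(L - h)) / (2.0 * h)


def separable_utility(x, L):
    # Hypothetical example of the form U(V(x), L), with V Cobb-Douglas.
    V = x[0] ** 0.4 * x[1] ** 0.6
    return math.sqrt(V) * (1.5 - L)


def nonseparable_utility(x, L):
    # Hypothetical example in which good 2 interacts directly with leisure (1 - L).
    return math.log(x[0]) + math.log(x[1] + 1.0 - L)


point, labor = [1.0, 1.0], 0.5
print(dlog_mrs_dL(separable_utility, point, labor, 1))     # numerically zero
print(dlog_mrs_dL(nonseparable_utility, point, labor, 1))  # close to 1 / 1.5
```

Under the separable form the ratio $U_2/U_1$ does not move with $L$, so the right-hand side of (21) is zero and no differential commodity tax is called for. Under the second form the ratio equals $x_1/(x_2 + 1 - L)$, whose log-derivative with respect to $L$ is $1/(x_2 + 1 - L)$, which is nonzero.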


13[...]


14We are indebted to J.A. Mirrlees for pointing this out in his discussion of the paper at the 
Paris conference. 
There are three interesting applications of the results given above which
should be mentioned briefly [see also Atkinson (1974)]. First, if the goods are
interpreted as consumption at different dates, then the analysis shows that the
conventional presumption in favor of consumption rather than income taxation
may be interpreted as assuming separability between leisure and consumption.
Perhaps a more reasonable structure of preferences in this context is

\[ U = U_1(c_1, L) + U_2(c_2), \]

in which case whether there should be an interest income tax or subsidy depends
[...]. The second application is to the question of the differential treatment of
safe and risky assets: $x_i$ is then treated as purchases of the $i$th security. Our
theorem then says that where the individual [...] [see [...] (1970) and Atkinson and
Stiglitz (1972)].


The third application is to the use of quotas of specific allocations for distri- 
buting certain goods. Some economists [e.g. Tobin (1970)] have argued that 
there exist certain inelastically supplied commodities (medical care, at least in
the short run) where quotas might be desirable. Such quota systems can be
viewed as an extreme nonlinear commodity tax-subsidy scheme: below the
[...] Viewed this way, the question of
[...] of quotas is equivalent simply to the question of
[...] commodity. The import of our theorem is that, if [...]
is satisfied, [...] The result does not depend on the supply
elasticities for the commodities in question.15

The basic intuition behind the argument that quotas might be desirable for 
inelastic commodities was that, if commodities are elastically supplied, then 
individuals should be allowed to trade off consumption of one good against 
the other: an individual’s increased consumption of vanilla ice cream cones 
does not deprive someone else of his consumption of vanilla ice cream cones. 
When commodities are inelastically supplied, then there is no production 
inefficiency introduced by quotas. But [...]
(the conventional exchange model). So long as tastes differ, the use of quotas
will result in exchange inefficiency.
But, it might be argued, if we had a separable utility function, a first-best


15In our proof, we assume an elastic supply of all commodities, but it is easy to establish 
that, provided profits (rents) are fully taxed, the results are true for any production technology 
(including the limiting case of a perfect inelastically supplied commodity). 
[...]. Such
an argument, though plausible at first sight, fails to recognize the second-best
[...]

A more plausible argument is that if we are able to discriminate among those
with higher incomes by charging them a higher price (e.g. by having price an
[...]


6. Differences in tastes and horizontal equity 


The existence of differences in tastes among individuals of the same ability 
raises issues in the design of the tax structure which we have not yet taken into 
account. In the conventional treatment, the principle of horizontal equity, that
[...],
plays an important role. In this section, we discuss, necessarily briefly, the nature
of this principle as well as its implications for the design of tax policy. We first
point out that the principle of horizontal equity [...]
the utilitarian maximum even when tastes are identical; next we examine the
case where tastes differ and show that the principle does not imply, as some have
suggested, uniform taxation; finally, we consider more generally the status of
horizontal equity as an objective of government policy. 

The literature on optimal taxation has typically assumed that the redistribu- 
tive goals of the government may be represented by maximizing a Bergsonian 
social welfare function, such as G(U) defined above, and has not discussed the 
relationship between this and the concept of horizontal equity. Some earlier 
authors have taken the view that there is no conflict: ‘the requirements of 
horizontal and vertical equity are but different sides of the same coin’ [Musgrave 
(1959, p. 160)]. However, this need not be so. It is quite possible that the maxi- 
mization of a Bergsonian social welfare function may indicate that individual 


16Spence (1975) and Weitzman (1974) have discussed this issue in a partial equilibrium
context. The fact that their results differ from those given here is attributable to the fact that 
the presence of the optimal income tax has important implications for the role to be played 
by other distributive mechanisms, as we have emphasized throughout this paper. 
[...] thus violating conventional notions of horizontal equity [see
Atkinson and Stiglitz (1976)].17

The point is that if the feasible set of allocations is not convex (as it may be
where [...]) [...]. An even stronger conflict has been noted by
Stiglitz (1974b), where horizontal equity may conflict with the principle of
[...]. Even before we introduce taste differences, therefore, there
is a possible conflict between horizontal equity and the maximization of a social 
welfare function of the type usually assumed. 

If we now introduce differences in tastes, the immediate consequence is that 
we must confront the interpersonal comparability question, which we have 
ignored thus far. When individuals have the same indifference curves, it is natural 
simply to use the same cardinal number of the indifference curves for different 
individuals. But when tastes differ, this is no longer so. Even if everyone had 
the same homothetic indifference maps, we must still decide which indifference 
curve for individual 1 corresponds to a given curve for individual 2. 
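A minimal worked example, not in the original analysis, may make the comparability problem concrete. Suppose two individuals share a fixed total $M$ of a single good, both prefer more to less, and the government numbers individual $i$'s utility levels as $k_i \log c_i$. The weights $k_i$ and the logarithmic form are assumptions introduced purely for illustration; any positive $k_i$ represents the same ordinal preferences.

```latex
\max_{c_1,\,c_2}\; k_1 \log c_1 + k_2 \log c_2
\quad \text{subject to} \quad c_1 + c_2 = M
\;\;\Longrightarrow\;\;
\frac{k_1}{c_1} = \frac{k_2}{c_2}
\;\;\Longrightarrow\;\;
c_i = \frac{k_i}{k_1 + k_2}\, M .
```

With $k_1 = k_2$ the utilitarian allocation is equal division, but relabelling one individual's indifference curves (say $k_1 = 2 k_2$) gives that individual two-thirds of the total, although nothing observable about preferences has changed. The choice of numbering is therefore itself an ethical judgment rather than a matter of fact.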

The point is that the utilitarian system evaluates taxes in terms of the individual’s
ability to derive utility from goods and leisure, and in this respect may
be contrasted with the alternative criterion of ‘ability to pay,’ that is, of basing
taxation on opportunity sets. When the only differences are those in the ability
to produce, then a utilitarian ethic leads to redistribution from those with ‘better’
[...]
individual 1 has a higher productivity, so that his budget constraint lies outside


17Consider the simplest possible case of labor and a single consumption good (C), with two
identical individuals. We assume that lump-sum taxes (poll taxes) are not admissible. The
utilitarian problem may be formulated as
$$\max\; V(q_1) + V(q_2),$$
subject to
$$t_1 C_1 + t_2 C_2 = R,$$
with first-order conditions
$$V_q(q_i) = -\lambda \left( C_i + t_i \frac{\partial C_i}{\partial q_i} \right),$$
where $\lambda$ is the Lagrange multiplier associated with the constraint. It is obvious that
$$q_1 = q_2 = q^* = 1 + t^*,$$
where
$$2 t^* C(q^*) = R,$$
satisfies the first-order conditions. But [...]
may well be positive at $q_i = q^*$, which would mean that this represents a local minimum.


18 Analogous results in different contexts have been noted by Stiglitz (1974b) and Mirrlees 
(1972). 


that of individual 2. The ability-to-pay criterion would indicate that individual
1 paid more tax, but there are obviously numberings of their indifference curves 
which lead to the opposite result with the utilitarian objective. 

In order to contrast these two approaches, let us suppose that tastes may be 
represented by a single parameter, y, so that the indirect utility function may be 
written as V(q, w, y). The utilitarian principle recognizes such taste differences
as a legitimate basis for discrimination, and the government maximizes G[V(q,
w, y)]. On the other hand, if we introduce the concept of horizontal equity and
interpret this as meaning that differences in tastes are not ‘relevant’ characteristics
on which discrimination ought to be based, then this has two implications.
Firstly, it introduces a cardinalisation V(1, w, y) = V(1, w), so that only endowments,
w, and consumer prices (normalized at unity before tax) are relevant.
Secondly, it constrains the government in levying taxes (q ≠ 1) to maintain


V(q, w, y) = V(q, w). (22)


Suppose that the government were to adopt this version of horizontal equity; 
what would be the implications for the optimal tax structure? It is popularly 


believed that it would require uniform taxation. [...]
equitable.19 This is not, however, necessarily correct, as may be
seen from the following example:


xi = (1/2z:) 


U= TAG? y E 


(It should be noted that we are assuming that there are no differences between 
people in the marginal utility of leisure, and that g; is independent of y.) Let us 
further assume that A; is independent of y, for i = 3,...,”, and that A, = y. 
The requirement of normalization is then that A,(y) is such that V(1, w, y) = 
V(1, w): i.e. that all those with the same w have the same pre-tax utility. Using 
this, it can be shown that the horizontal equity condition (22) requires that?° 


git = g. 23) 


19Pigou (1947) gives a nice example: ‘When England and Ireland were united under the 
same taxing authority, it was strongly argued that, owing to the divergent tastes of Englishmen 
and Irishmen, it was improper to subject them to the same tax formulae in respect of beer and 
whiskey.’ The tax on spirits, more generally consumed in Ireland, was more than two-thirds 
of the price, whereas the tax rate on beer was only about one-sixth of the price. 

207t may be noted that i & ¥ 1). 




V(q, w, y) = zé 




The condition for horizontal equity is not, therefore, uniform taxation; only if
[...] cream case - would uniform tax rates be horizontally equitable. This may be


related to the argument made by Pigou (1947, p. 77): 


Suppose that there are two persons of equal income and general economic 
status, that in the aggregate of their tastes they are similar, in the sense that 
they would get equal satisfactions from equal incomes if they were permitted 
to spend them as they chose, but that one likes and purchases commodity A 
and not commodity B, the other commodity B and not commodity A. Suppose,
further, that taxes are imposed upon commodities A and B in such ways
that both these persons pay the same amount of tax. It will not necessarily
follow that they suffer equal real burdens. If the demand of one for his
commodity is more elastic than the demand of the other for his, the former 


will suffer the larger hurt. 
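Pigou's point can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration rather than part of the original argument: demands are taken to be of constant-elasticity form x = a q^(-eps) with pre-tax price 1, the real burden is measured by the loss of consumer surplus (which presumes a constant marginal utility of income), and the function names are hypothetical. Two consumers with equal pre-tax spending are taxed so that each pays the same amount, and their burdens are then compared.

```python
def tax_paid(t, a, eps):
    """Tax revenue t * x(q) with demand x = a * q**(-eps) and consumer price q = 1 + t."""
    return t * a * (1.0 + t) ** (-eps)


def surplus_loss(t, a, eps):
    """Consumer surplus lost when the price rises from 1 to 1 + t (requires eps != 1)."""
    return a * ((1.0 + t) ** (1.0 - eps) - 1.0) / (1.0 - eps)


def rate_for_revenue(target, a, eps, lo=0.0, hi=1.0, iters=100):
    """Bisect for the tax rate in [lo, hi] that raises `target`.

    Assumes tax_paid is increasing on [lo, hi], which holds on [0, 1]
    for the elasticities used below.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if tax_paid(mid, a, eps) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    target = 0.2  # same tax payment demanded from each person
    for label, eps in (("inelastic demand (commodity A)", 0.5),
                       ("elastic demand (commodity B)", 2.0)):
        t = rate_for_revenue(target, 1.0, eps)
        print(label, "rate:", round(t, 4),
              "tax paid:", round(tax_paid(t, 1.0, eps), 4),
              "burden:", round(surplus_loss(t, 1.0, eps), 4))
```

Under these assumptions the person with the more elastic demand must face a higher tax rate to yield the same payment and loses more consumer surplus, so equal tax payments do not imply equal real burdens.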


The model just described is a very simple one, but it brings out clearly the 
conflict between horizontal equity and the maximization of a social welfare 
function of the Bergson type. For example, where G’ = 1 (the classical utili- 
tarian case), the latter leads to the first-order condition, 

E. _! —br; , 
qi Ei 
as before. This is not in general consistent with the requirement of horizontal 
equity, eq. (23). 


This raises the important issue of the status of the horizontal equity principle. 


[...] equity: ‘it is sometimes said that the horizontal aspect is more basic and less
controversial’ [Musgrave and Musgrave (1973, p. 199)]. Most authors, including 
Musgrave and Musgrave, go on to argue that neither is more basic than the 
other; however, this ignores the conflict which we have seen to arise between 
the two principles, at least in the form presented here. Faced with this potential 
conflict, it might seem more reasonable to view the social welfare function as 
lexicographic. 


[...] and the
government maximizes a Bergsonian social welfare function subject to


this constraint. As Pigou (1947, p. 51) put it, ‘the ideal of least sacrifice has to 
be pursued subject to a handicap.’ The optimal structure of taxation, and the 
choice between direct and indirect taxes, will depend on how wide is the range 
of goods covered by constraints such as (23). 


7. Concluding comments 


In this paper, we have attempted to present a framework within which we 
can evaluate the appropriateness of different tax bases and to apply this framework
to the classical question of the use of direct versus indirect taxation.

The general framework employed may be summarized as follows. The 
necessity for any form of taxation other than a uniform lump-sum tax arises 
from the fact that individuals have differing characteristics (endowments or 
tastes). If we could observe all relevant characteristics costlessly and perfectly, 
we should be able to achieve a first-best solution. However, in practice we have
to make use of surrogate characteristics, which are related systematically to the
characteristics on which we would like to differentiate individuals, but which
are not perfectly correlated and which are, to some extent, under the control
of the individual. Certain ethical principles, notably those which fall under the
rubric of horizontal equity, limit further the set of surrogates which may be
used. Having established an admissible class of characteristics, the problem
then becomes one of determining which are to be employed (the choice of tax 
base) and the structure of the tax schedule. 

The application of this framework to the direct/indirect tax problem led to 
the following results. Firstly, if the government had no distributional objectives
and was concerned solely with efficiency, it may employ only direct taxation
and this would take the form of a poll tax. This is a very straightforward prescription,
but it has the implication, which runs counter to much popular belief,
that the use of indirect taxation stems from a pursuit of distributional objectives.
The extent to which indirect taxes are employed to this purpose (that is,
purchases of different commodities are used as a screening device) depends on
the form of consumer preferences and on the restrictions (if any) on the type of
income taxation employed. If a general income tax function may be chosen by
the government, we have shown that, where the utility function is separable
between labor and all commodities, no indirect taxes need be employed. In this 
case, the use of consumption of particular commodities as a screening device 
offers no benefit. Finally, we have seen that horizontal equity considerations 
may impose constraints on the structure of taxes which may be levied. 

Throughout the paper, we have stressed the importance of the interactions 
between different taxes, and the fact that a piecemeal approach may be misleading.
In section 4, for example, it was shown that in the quadratic case considered
by Ramsey (plus constant marginal utility of leisure and independence)
the introduction of an optimal linear income tax meant that indirect taxation 
was no longer necessary. The Ramsey-style results would, therefore, only be 
relevant where there were constraints on the use of income taxation. Such
interactions are equally a warning that the results given in this paper should be
treated with considerable caution. For this and other reasons, such as the failure
to incorporate the costs of administration,21 the theory may be more useful in
illuminating the structure of the argument than in providing definite answers to
policy issues.


21See Heller and Shell (1974) for an attempt to introduce administration costs into the 
analysis of optimal taxation. 